Supplementary Information
The claim and evidence conflict pairs can be found at https://huggingface. The scope of our dataset is purely for scientific research.

Conflict Verification: Ensuring that the default and conflict evidence are contradictory. The human evaluation results showed a high level of accuracy in our data generation process.

We select models with 2B and 7B parameters for our analysis. LLaMA2 [Touvron et al., 2023] is a popular open-source foundation model, trained on 2T tokens; models with 7B and 70B parameters are selected for our analysis. To facilitate parallel training, we employ DeepSpeed ZeRO Stage 3 [Ren et al., ...].

The prompt for generating semantic conflict descriptions is shown in Figure 1. The prompt for generating default evidence is shown in Table 6. The prompt for generating misinformation conflict evidence is shown in Table 7. The prompt for generating temporal conflict evidence is shown in Table 8. The prompt for generating semantic conflict evidence is shown in Table 9.
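The DeepSpeed ZeRO Stage 3 setup mentioned above can be sketched as a minimal configuration dict. The specific values here (batch size, accumulation steps, precision) are illustrative assumptions, not the paper's reported settings; only the `zero_optimization` stage itself is taken from the text.

```python
# Minimal DeepSpeed ZeRO Stage 3 configuration (illustrative values, not the
# paper's actual settings). A dict of this shape is passed to
# deepspeed.initialize() via its `config` argument.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,  # assumed; tune to GPU memory
    "gradient_accumulation_steps": 8,     # assumed
    "bf16": {"enabled": True},            # mixed precision is optional
    "zero_optimization": {
        "stage": 3,                   # ZeRO Stage 3: shard parameters,
                                      # gradients, and optimizer state
        "overlap_comm": True,         # overlap communication with compute
        "contiguous_gradients": True,
    },
}

# Typical usage (requires the `deepspeed` package, a model, and a launcher):
# import deepspeed
# engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, model_parameters=model.parameters(), config=ds_config)
```

Stage 3 shards parameters in addition to gradients and optimizer state, which is what makes fine-tuning 7B–70B models feasible across multiple GPUs.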
A Benchmark for Evaluating Knowledge Conflicts in Large Language Models
Large language models (LLMs) have achieved impressive advancements across numerous disciplines, yet the critical issue of knowledge conflicts, a major source of hallucinations, has rarely been studied. While a few studies have explored the conflicts between the inherent knowledge of LLMs and retrieved contextual knowledge, a comprehensive assessment of knowledge conflicts in LLMs is still missing.
How does Misinformation Affect Large Language Model Behaviors and Preferences?
Miao Peng, Nuo Chen, Jianheng Tang, Jia Li
Large Language Models (LLMs) have shown remarkable capabilities in knowledge-intensive tasks, yet they remain vulnerable when encountering misinformation. Existing studies have explored the role of LLMs in combating misinformation, but there is still a lack of fine-grained analysis of the specific aspects and the extent to which LLMs are influenced by misinformation. To bridge this gap, we present MisBench, the current largest and most comprehensive benchmark for evaluating LLMs' behavior and knowledge preference toward misinformation. MisBench consists of 10,346,712 pieces of misinformation, and uniquely considers both knowledge-based conflicts and stylistic variations in misinformation. Empirical results reveal that while LLMs demonstrate comparable abilities in discerning misinformation, they remain susceptible to knowledge conflicts and stylistic variations. Based on these findings, we further propose a novel approach called Reconstruct to Discriminate (RtD) to strengthen LLMs' ability to detect misinformation. Our study provides valuable insights into LLMs' interactions with misinformation, and we believe MisBench can serve as an effective benchmark for evaluating LLM-based detectors and enhancing their reliability in real-world applications. Code and data are available at https://github.com/GKNL/MisBench.
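The abstract above describes entries that pair knowledge-based conflicts with stylistic variations. A benchmark record of that shape might look like the following sketch; the class and field names are hypothetical illustrations inferred from the abstract, not MisBench's actual data format.

```python
from dataclasses import dataclass

# Hypothetical schema for one misinformation-benchmark entry, inferred from
# the abstract: a factual claim, supporting (default) evidence, a conflicting
# piece of misinformation, and the style in which that misinformation is
# written. Field names are illustrative, not taken from the released data.
@dataclass
class MisinfoEntry:
    claim: str
    default_evidence: str
    conflict_evidence: str  # knowledge-based conflict with the claim
    conflict_style: str     # stylistic variation, e.g. "news" or "blog"

entry = MisinfoEntry(
    claim="Water boils at 100 °C at sea level.",
    default_evidence="At standard atmospheric pressure, water boils at 100 °C.",
    conflict_evidence="Recent reports claim water boils at 90 °C at sea level.",
    conflict_style="news",
)
```

Separating the conflict content from its style is what lets a benchmark measure the two effects independently, as the abstract's findings distinguish them.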
Corn Yield Prediction Model with Deep Neural Networks for Smallholder Farmer Decision Support System
Chollette Olisah, Lyndon Smith, Melvyn Smith, Lawrence Morolake, Osi Ojukwu
Given the nonlinearity of the interaction between weather and soil variables, a novel deep neural network regressor (DNNR) was carefully designed with consideration of the depth, the number of neurons in the hidden layers, and the hyperparameters with their optimizations. Additionally, a new metric, the average of absolute root squared error (ARSE), was proposed to address the shortcomings of root mean square error (RMSE) and mean absolute error (MAE) while combining their strengths. Using the ARSE metric, the random forest regressor (RFR) and the extreme gradient boosting regressor (XGBR) were compared with the DNNR. The RFR and XGBR achieved yield errors of 0.0000294 t/ha and 0.000792 t/ha, respectively, compared to the DNNR(s), which achieved 0.0146 t/ha and 0.0209 t/ha, respectively. All errors were impressively small. However, with changes to the explanatory variables to ensure generalizability to unforeseen data, the DNNR(s) performed best. The term "unforeseen data", distinct from unseen data, is coined to represent sudden and unexplainable changes to weather and soil variables due to climate change. Further analysis reveals that a strong interaction does exist between weather and soil variables. Using precipitation and silt, which are strong-negatively and strong-positively correlated with yield, respectively, yield was observed to increase when precipitation was reduced and silt increased, and vice versa.
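The ARSE metric is named but not defined in the abstract above. A minimal sketch, under the assumption that "average of absolute root squared error" means taking the root of each squared error per sample and then averaging (in contrast to RMSE, which averages first and takes the root once), could look like this; the toy yield values are illustrative:

```python
import math

def rmse(y_true, y_pred):
    # root of the mean squared error: root taken once, after averaging
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    # mean of the absolute errors
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def arse(y_true, y_pred):
    # assumed reading of "average of absolute root squared error":
    # root of each per-sample squared error, then the average
    n = len(y_true)
    return sum(abs(math.sqrt((t - p) ** 2)) for t, p in zip(y_true, y_pred)) / n

# Toy yields in t/ha (illustrative values, not the paper's data)
y_true = [2.1, 3.0, 4.2]
y_pred = [2.0, 3.3, 4.0]
```

Note that under this per-sample reading ARSE reduces numerically to MAE; the paper's actual definition may include terms not recoverable from the abstract, so treat this only as a sketch of the root-then-average idea.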